Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

inclusion: bench enact_candidate weight #5270

Merged
merged 62 commits into from
Aug 29, 2024
Merged

Conversation

ordian
Copy link
Member

@ordian ordian commented Aug 7, 2024

On top of #5082.

Background

Previously, before #3479, we would include the cost enacting the candidate into the cost of processing a single bitfield. Now it is different, although the benchmarks seems to be not-up-to date.
Including the cost of enacting a candidate into a processing a single bitfield cost was incorrect, since we multiple that by the number of bitfields we have. Instead, we should separate calculate the cost of processing a single bitfield without enactment, and multiple the cost of enactment by the actual number of processed candidates (which is limited by the number cores, not validators).

Bench

Previously, the weight of enact_candidate was calculated manually (without a benchmark) and then neglected:

let _weight = Self::enact_candidate(

In this PR, we have a benchmark for it and it's based on the number of ump and sent hrmp messages as well as whether the candidate has a runtime upgrade (new_validation_code).
The differences from the previous attempt paritytech/polkadot#6929 are that

  • we don't include the cost of enactment into the cost of processing a backed candidate.
    The reason for it is that enactment happens not in the same block as backing (typically the next one), since we process bitfields before backing votes.
  • we don't take into account the size of the runtime upgrade, the benchmark weight doesn't seem to depend much on it, but rather whether there was one or not.

Similarly to the previous attempt, we don't account for dmp messages (fixed cost). Also we don't account properly for received hrmp messages (hrmp_watermark) because the cost of it depends on the runtime state and can't be statically deduced in the benchmark (unless we pass the information about channels as benchmark u32 arguments).

The total weight cost of processing a parainherent now includes the cost of enactment of each candidate, but we don't do filtering based on that (because we enact after processing bitfields and making other changes to the storage).

Numbers

Reads = 7 + (0 * u) + (3 * h) + (8 * c)
Writes = 10 + (1 * u) + (3 * h) + (7 * c)

In addition, there is a fixed cost of a few of ms (!) per candidate.

This might result a full block slightly overflowing its weight with 200 enacted candidates, which in turn could prevent non-mandatory transactions from being included in a block.

Given our modest limits on max ump and hrmp messages:

  maxUpwardMessageNumPerCandidate: 16
  hrmpMaxMessageNumPerCandidate: 10

and the fact that runtime upgrades are can't happen very frequently (validation_upgrade_cooldown), we might only go over the limits in case of many disputes.

TODOs:

  • Fix the overweight test
  • Generate the weights for Westend and Rococo
  • PRDoc

ordian and others added 16 commits July 19, 2024 17:57
…=rococo --target_dir=polkadot --pallet=polkadot_runtime_parachains::paras_inherent
…=westend --target_dir=polkadot --pallet=polkadot_runtime_parachains::paras_inherent
* master:
  Bump slotmap from 1.0.6 to 1.0.7 (#5096)
  feat: introduce pallet-parameters to Westend to parameterize inflation (#4938)
  Bump openssl from 0.10.64 to 0.10.66 (#5107)
  Bump lycheeverse/lychee-action from 1.9.1 to 1.10.0 (#5094)
  docs: remove the duplicate word (#5095)
  Prepare PVFs if node is a validator in the next session (#4791)
  Update parity publish (#5105)
* master: (27 commits)
  Bridges improved tests and nits (#5128)
  Fix misleading comment about RewardHandler in epm config (#3095)
  Introduce a workflow updating the wishlist leaderboards (#5085)
  membership: Restructure pallet into separate files (#4536)
  Fix after ring-proof api change (#5126)
  Bump paritytech/review-bot from 2.4.0 to 2.5.0 (#5057)
  Bump docker/login-action from 3.0.0 to 3.3.0 (#5109)
  Bump docker/build-push-action from 5.1.0 to 6.5.0 (#5108)
  Bump peter-evans/create-pull-request from 5.0.0 to 6.1.0 (#5093)
  Tx Payment: drop ED requirements for tx payments with exchangeable asset  (#4488)
  Remove `pallet-getter` usage from pallet-transaction-payment (#4970)
  pallet macro: do not generate try-runtime related code when frame-support doesn't have try-runtime. (#5099)
  fix(chain-spec): ChainSpecBuilder with object as default genesis (#4345)
  Migrate BEEFY BLS crypto to  bls12-381 curve (#4931)
  Bump clap from 4.5.9 to 4.5.10 in the known_good_semver group (#5120)
  Use jobserver in wasm-builder to limit concurrency of spawned cargo processes (#4946)
  include events for voting (#4613)
  [subsystem-bench] Add mocks for own assignments triggering (#5042)
  Remove not-audited warning (#5114)
  hotfix: blockchain/backend: Skip genesis leaf to unblock syncing (#5103)
  ...
* master: (51 commits)
  Remove unused feature gated code from the minimal template (#5237)
  make polkadot-parachain startup errors pretty (#5214)
  Coretime auto-renew (#4424)
  network/strategy: Backoff and ban overloaded peers to avoid submitting the same request multiple times (#5029)
  Fix frame crate usage doc (#5222)
  beefy: Tolerate pruned state on runtime API call (#5197)
  rpc: Enable ChainSpec for polkadot-parachain (#5205)
  Add an adapter for configuring AssetExchanger (#5130)
  Replace env_logger with sp_tracing (#5065)
  Adjust sync templates flow to use new release branch (#5182)
  litep2p/discovery: Publish authority records with external addresses only (#5176)
  Run UI tests in CI for some other crates (#5167)
  Remove `pallet::getter` usage from the pallet-balances (#4967)
  pallet-timestamp: `UnixTime::now` implementation logs error only if called at genesis (#5055)
  [CI] Cache try-runtime check (#5179)
  [Backport] version bumps and the prdocs reordering from stable2407 (#5178)
  [subsystem-benchmark] Update availability-distribution-regression-bench baseline after recent subsystem changes (#5180)
  Remove pallet::getter usage from proxy (#4963)
  Remove pallet::getter macro usage from pallet-election-provider-multi-phase (#4487)
  Review-bot@2.6.0 (#5177)
  ...
@ordian ordian added T8-polkadot This PR/Issue is related to/affects the Polkadot network. T12-benchmarks This PR/Issue is related to benchmarking and weights. labels Aug 7, 2024
@ordian
Copy link
Member Author

ordian commented Aug 7, 2024

bot bench polkadot-pallet --pallet=polkadot_runtime_parachains::inclusion --runtime=westend
bot bench polkadot-pallet --pallet=polkadot_runtime_parachains::inclusion --runtime=rococo
bot clean

@command-bot
Copy link

command-bot bot commented Aug 7, 2024

@ordian Command "$PIPELINE_SCRIPTS_DIR/commands/bench/bench.sh" --subcommand=pallet --runtime=westend --target_dir=polkadot --pallet=polkadot_runtime_parachains::inclusion has finished. Result:

HttpError: Not Found
HttpError: Not Found
    at /app/node_modules/@octokit/request/dist-node/index.js:86:21
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async sendRequestWithRetries (/app/node_modules/octokit-auth-probot/node_modules/@octokit/auth-app/dist-node/index.js:466:12)
    at async Job.doExecute (/app/node_modules/bottleneck/light.js:405:18)

@command-bot command-bot bot deleted a comment from github-actions bot Aug 7, 2024
@command-bot
Copy link

command-bot bot commented Aug 7, 2024

@ordian Command "$PIPELINE_SCRIPTS_DIR/commands/bench/bench.sh" --subcommand=pallet --runtime=rococo --target_dir=polkadot --pallet=polkadot_runtime_parachains::inclusion has finished. Result:

HttpError: Not Found
HttpError: Not Found
    at /app/node_modules/@octokit/request/dist-node/index.js:86:21
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async sendRequestWithRetries (/app/node_modules/octokit-auth-probot/node_modules/@octokit/auth-app/dist-node/index.js:466:12)
    at async Job.doExecute (/app/node_modules/bottleneck/light.js:405:18)

command-bot and others added 6 commits August 7, 2024 11:23
…=rococo --target_dir=polkadot --pallet=polkadot_runtime_parachains::paras_inherent
…=westend --target_dir=polkadot --pallet=polkadot_runtime_parachains::paras_inherent
* master:
  Umbrella crate: exclude chain-specific crates (#5173)
  Bring reference_hardware.json inline with machine used for weights (#5196)
  Snowbridge on Westend (#5074)
  Run semver check even when no prdoc (#5189)
  Export more from sc-service (#5250)
  Update the wishlist leaderboard script to handle PRs (#5256)
Copy link
Contributor

@alindima alindima left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, nice work!

polkadot/runtime/parachains/src/inclusion/mod.rs Outdated Show resolved Hide resolved
let max_hrmp_msgs = config.hrmp_max_message_num_per_candidate;
// No bitfields - no enacted candidates
let bitfield_size = bitfields.first().map(|b| b.unchecked_payload().0.len()).unwrap_or(0);
set_proof_size_to_tx_size(
Copy link
Contributor

@sandreim sandreim Aug 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should get rid of tracking the proof size in polkadot-runtime-parachains, the BlockWeight proof size limit is set to u64::MAX on the relaychain. It only makes sense for parachains to measure it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, but it is orthogonal to this PR and I'd leave to for the later refactoring aka the long-term fix.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ordian @sandreim No, we actually use this to ensure we don't overfill the block with transactions.

github-merge-queue bot pushed a commit that referenced this pull request Aug 29, 2024
closes #849

## Context

For the background on this and the long-term fix, see
#849 (comment).

## Changes

* The weigh files are renamed from `runtime_(parachains|common).*` to
`polkadot_runtime_(parachains|common).*`. The reason for it is the
renaming introduced in #4633. The new weight command and files are
generated now include `polkadot_` prefix.
* The WeightInfo for `paras_inherent` now includes `enter_empty` which
calculates the cost of processing an empty parachains inherent. This
cost is subtracted dynamically when calculating other weights (so the
other weights remain the same)

## Benefits

See
#849 (comment),
but TL;DR is that we are not blocked on weights for scaling the number
of validators and cores further.

Resolved questions:
- [x] why new benchmarks for westend are doing fewer db IOPS?
Is it due polkadot-sdk update (db IOPS diff)?
or the bench setup is no longer valid?

https://github.com/polkadot-fellows/runtimes/blob/7723274a2c5cbb10213379271094d5180716ca7d/relay/polkadot/src/weights/runtime_parachains_paras_inherent.rs#L131-L196
Answer: see background section of #5270 

TODOs:
- [x] Rerun benchmarks for Rococo and Westend
- [x] PRDoc

---------

Co-authored-by: command-bot <>
Base automatically changed from ao-fix-parainclusion-weight-overestimation to master August 29, 2024 09:16
* master: (39 commits)
  short-term fix for para inherent weight overestimation (#5082)
  CI: Add backporting bot (#4795)
  Fix benchmark failures when using `insecure_zero_ed` flag (#5354)
  Command bot GHA v2 - /cmd <cmd> (#5457)
  Remove pallet::getter usage from treasury (#4962)
  Bump blake2b_simd from 1.0.1 to 1.0.2 (#5404)
  Bump rustversion from 1.0.14 to 1.0.17 (#5405)
  Bridge zombienet tests: remove old command (#5434)
  polkadot-parachain: Add omni-node variant with u64 block number (#5269)
  Refactor verbose test (#5506)
  Use umbrella crate for minimal template (#5155)
  IBP Coretime Polkadot bootnodes (#5499)
  rpc server: listen to `ipv6 socket` if available and `--experimental-rpc-endpoint` CLI option (#4792)
  Update approval-voting-regression-bench (#5504)
  change try-runtime rpc domains (#5443)
  polkadot-parachain-bin: Remove contracts parachain (#5471)
  Add feature to allow Aura collator to use full PoV size (#5393)
  Adding stkd bootnodes (#5470)
  Make `PendingConfigs` storage item public (#5467)
  frame-omni-bencher maintenance (#5466)
  ...
@ordian ordian requested a review from eskimor August 29, 2024 09:25
@paritytech-cicd-pr
Copy link

The CI pipeline was cancelled due to failure one of the required jobs.
Job name: test-linux-stable 1/3
Logs: https://gitlab.parity.io/parity/mirrors/polkadot-sdk/-/jobs/7178925

@ordian ordian added this pull request to the merge queue Aug 29, 2024
Merged via the queue into master with commit ddd58c1 Aug 29, 2024
187 of 188 checks passed
@ordian ordian deleted the ao-enact-candidate-weight branch August 29, 2024 13:01
github-merge-queue bot pushed a commit that referenced this pull request Sep 4, 2024
Follow-up to #5270. The baseline numbers for Westend were too high to be
representative of the reality as it seemed to do with how multi-variate
linear regression is calculated.

---------

Co-authored-by: command-bot <>
@alindima alindima mentioned this pull request Oct 15, 2024
7 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T8-polkadot This PR/Issue is related to/affects the Polkadot network. T12-benchmarks This PR/Issue is related to benchmarking and weights.
Projects
Status: Completed
Development

Successfully merging this pull request may close these issues.

5 participants